Effects of intersegmental transfers on target location by proteins

We study a model for a protein searching for a target, using facilitated diffusion,
on a DNA molecule confined in a finite volume. The model includes three distinct
pathways for facilitated diffusion: (a) sliding, in which the protein diffuses along the
contour of the DNA; (b) jumping, where the protein travels between two sites along
the DNA by three-dimensional diffusion; and (c) intersegmental transfer,
which allows the protein to move from one site to another by transiently binding both
at the same time. The typical search time is calculated using scaling arguments which
are verified numerically. Our results suggest that the inclusion of intersegmental
transfer (i) decreases the search time considerably, (ii) makes the search time much
more robust to variations in the parameters of the model and (iii) shifts the optimal
search time to a regime very different from that found for models which
ignore intersegmental transfers. The behavior we find is rich and shows surprising
dependencies, for example, on the DNA length.



I. INTRODUCTION



Many biological processes depend on the ability of proteins to locate specific DNA
sequences on time scales ranging from seconds to minutes. Examples include gene expression
and repression, DNA replication and others [1]. Naively, one might expect the protein to
search for its target using only three-dimensional diffusion^1. Neglecting interactions of the
protein with the environment and the DNA (apart from the target site), one then finds, using
results first obtained by Smoluchowski [?], that the average search time, $t^{\rm search}$, is given by

$$ t^{\rm search} \sim \frac{\Lambda^3}{D_3 r} . \tag{1} $$

Here $D_3$ is the three-dimensional diffusion constant of the protein, $r$ is the target size and
$\Lambda^3$ is the volume that needs to be searched. Assuming a target size of the order of a base pair,
$r \sim 0.34$ nm, a typical nucleus (or bacterium) size of $\Lambda \sim 10^3$ nm and using the measured
three-dimensional diffusion coefficient for a GFP protein in vivo, $D_3 \sim 10^7$ nm$^2$/s [3], one
finds $t^{\rm search}$ of the order of hundreds of seconds. If $N$ proteins are searching for the same
target the search time is given by^2 $t^{\rm search}_N \sim t^{\rm search}/N$. This suggests that about 10 proteins
could find a target in reasonable times for cells to function properly.

^1 In this paper we only consider proteins whose motion is diffusive and not directed (directed motion could
result from consumption of, for example, chemical energy and is discussed in [?]).
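
The estimate above can be reproduced with a few lines of arithmetic. The following is only an order-of-magnitude sketch of Eq. (1): the function names are hypothetical, prefactors of order one are dropped, and the inputs are the typical values quoted in the text.

```python
# Order-of-magnitude sketch of the three-dimensional search time, Eq. (1).
# Function names are illustrative; prefactors of order one are ignored.

def smoluchowski_search_time(cell_size_nm, d3_nm2_per_s, target_size_nm):
    """Return t ~ Lambda^3 / (D3 * r) in seconds."""
    return cell_size_nm ** 3 / (d3_nm2_per_s * target_size_nm)


def search_time_n_proteins(t_single_s, n_proteins):
    """N independent searchers: t_N ~ t / N."""
    return t_single_s / n_proteins


if __name__ == "__main__":
    # Typical values quoted in the text: Lambda ~ 1e3 nm, D3 ~ 1e7 nm^2/s, r ~ 0.34 nm.
    t_single = smoluchowski_search_time(1e3, 1e7, 0.34)
    print(f"single protein: {t_single:.0f} s")
    print(f"10 proteins:    {search_time_n_proteins(t_single, 10):.0f} s")
```

With these inputs the single-protein time comes out at a few hundred seconds, consistent with the estimate in the text.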

In real systems, due to the interactions of proteins with non-specific DNA sequences
and the environment [?], the picture is more complex. Indeed, in vitro experiments have
suggested that mechanisms other than three-dimensional diffusion are used by many proteins
to locate their targets [?, ?]. These strategies have been studied and debated extensively,
both in the context of in vivo [?, 10, ?] and in vitro systems [?, 10, ?, ?, 14, ?, 16],
and are believed, in general, to allow for search times which are faster than that given by
Eq. (1).

Historically, the first strategy that was proposed combines one-dimensional diffusion
(sliding) over the DNA with intervals of three-dimensional diffusion (typically called jumping in
this context) (see Fig. 1). Each individual search mechanism, when applied alone,
has shortcomings and advantages over the other. When using only three-dimensional
diffusion, the number of new three-dimensional positions probed grows linearly in time, but the
protein spends much time probing sites where there is no DNA present. In contrast, during
one-dimensional diffusion the protein is constantly bound to the DNA but suffers from a
slow increase in the number of new positions probed as a function of time ($\sim t^{1/2}$, where $t$
denotes time) [18]. As shown, for example, in Refs. [?] and [13], by intertwining one- and
three-dimensional search strategies and tuning the properties of both one can in fact decrease the
search time significantly^3.

The combined strategy, while better than the pure search strategies, comes at a cost of
being sensitive to changes in the properties of either the three-dimensional or one-dimensional
diffusive processes. For example, as we argue below, the typical search time changes
exponentially in the square root of the ionic strength. Moreover, given the many constraints on
the protein to function, it is very restrictive to demand optimization for the search process.
Indeed, equilibrium measurements [21] and recent single-molecule experiments [19, 20] on the
Lac repressor protein suggest that the search process may not, in general, be optimized for
this search strategy.

^2 The relation between the search time $t^{\rm search}$ for one protein and the search time $t^{\rm search}_N$ for
$N$ proteins remains unchanged throughout the paper.

^3 Clearly, a pure one-dimensional search strategy is not efficient due to the slow diffusive search along the
DNA, $t^{\rm search} \sim L^2/D_1 \sim O({\rm hours})$, where $L \sim 10^6$ nm is the genome length and $D_1$ is the
one-dimensional diffusion coefficient, which was measured indirectly [12] and directly [?] to be much
smaller than the three-dimensional diffusion coefficient.

A third mechanism which was suggested to speed up the search is intersegmental
transfer (IT) [22, 23]. During an IT the protein moves from one site to another by transiently
binding both at the same time. In principle the new site can be either close along the
one-dimensional DNA sequence (or chemical distance) or distant (see Fig. 3). This mechanism
is likely to be relevant for proteins that have more than one binding domain, like the
Lac repressor [24, 25], GRdbd [26] and the SfiI enzyme [27]. However, it could also occur in
proteins with a single binding site in locations where the DNA crosses itself. To date we
are aware of direct evidence for IT only for RNA polymerase [28]. However, measurements of
the dissociation rate from a labeled (operator) DNA site of the rat glucocorticoid receptor
[26], CAP and Lac repressor [29] revealed a significant dependence on the DNA concentration
in the solvent, a possible explanation for which is IT. Some theoretical work has suggested
that in vivo, when the DNA concentration is much larger than in in vitro experiments, IT may
play a determinative role [16]. These studies focus on the ITs resulting from the DNA
dynamics and consider the protein to be point-like.

In this paper we present a rather comprehensive study of the effects of ITs on the search
process for a DNA molecule confined in a finite volume, similar to the in vivo scenario.
Our work complements previous ones by explicitly accounting for the size of the protein
and considering two limiting cases: (i) DNA which is completely static during the search
process and (ii) DNA whose motion is quicker than the protein's motion along the
DNA. Using scaling arguments backed by numerics we obtain expressions for $t^{\rm search}$ and
the optimal search time (obtained by tuning parameters such as the DNA-protein affinity).
A central conclusion of this paper is that the search time is much more robust to variations
in parameters when ITs are allowed^4. This is to the extent that in some cases any finite
jumping rate can have a negative influence on the search time. In particular, the optimal
search time is found to occur for parameter regimes very different from the canonical one
(see Sec. II) found in models which ignore ITs. Perhaps most important, as we show, our
work suggests that ITs could explain recent findings which indicate a much higher affinity
of the TF Lac repressor to the DNA than required by an optimal search strategy which uses
only sliding and jumping [20, 21].

^4 Of course, this fact may be both advantageous and disadvantageous for the cell. In some cases the
cell needs transcription factors whose kinetic (and, therefore, equilibrium) properties do depend on the
environment and in other cases it does not.

FIG. 1: Schematic plots illustrating the different mechanisms that can participate in the facilitated
diffusion process. Here dashed arrows represent different protein moves, the solid curve represents
the DNA and a small circle with two legs indicates a protein with two binding domains. The figure
shows (a) sliding, (b) a correlated intersegmental transfer, (c) an uncorrelated intersegmental
transfer, (d) jumping. The distinction between (b) and (c) is defined in Sec. III. (e) The dashed
(dotted) line represents a one-dimensional (three-dimensional) distance.



The scaling dependence of the search time on different parameters is rich and very
different from regular facilitated diffusion (involving only sliding and jumping). Consider, for
example, the dependence of the search process on the length of the DNA, $L$, for a DNA
confined in a volume $\Lambda^3$. Using only sliding and jumping, the regime typically thought to
be relevant to experiment has a linear dependence of the search time on the DNA length $L$.
A Smoluchowski-like search time is independent of $L$. In contrast, when ITs are allowed we
find different behavior. We estimate that the regime most relevant to in vivo experiments (in
prokaryotic organisms) occurs when the dependence on the length of the DNA is weak. For
example, when searches are performed using only ITs the search time can be independent
of $L$ or scale as $\sqrt{L}$, depending on the DNA's dynamics. These scaling behaviors rely on the
confinement of the DNA in a finite volume (shown in detail in Fig. 9) and could be used as
experimental probes for the existence of ITs.

The paper is organized as follows: Sec. II briefly reviews the main arguments used to
analyze searches that combine only sliding and jumping. In Sec. III the average search time
is calculated for the case of a strategy based only on ITs for both quenched and annealed
DNA. In Sec. IV a search process that includes ITs and sliding is considered. Sec. V
considers the possibility that the protein can unbind from the DNA (jump) and perform
ITs. Sec. VI studies a model with all three mechanisms. Finally, in Sec. VII we discuss
possible scenarios for the Lac repressor and summarize in Sec. VIII.



II. SLIDING AND JUMPING 



To set the stage for a discussion of the effects of IT we consider a search process which
uses only sliding and jumping. The discussion follows Refs. [?] and [?] closely. We imagine
a single protein searching for a single target located on the DNA. The search is composed
of a series of intervals of one-dimensional diffusion along the DNA (sliding) and
three-dimensional diffusion in the solution (jumping). The typical time of each is denoted by
$\tau_1$ and $\tau_3$ respectively. Following a jump, the protein is assumed to associate at a new,
randomly chosen location along the DNA. While this approach is somewhat simplistic for
jumps occurring in two dimensions and below, for three dimensions, the case we consider,
it is well suited [30].



Under these assumptions, during each sliding event the protein covers a typical length $l$,
where $l \sim \sqrt{D_1 \tau_1}$ (often called the antenna size) [18]. Since correlations between the
locations of the protein before and after the jump are neglected, the search process, completed
when roughly all the DNA is scanned, is separated into

$$ N_r \sim \frac{L}{l_s} \tag{2} $$

rounds of sliding and jumping. Here $l_s$ is the typical length scanned by the protein during a
round. If during the slide the protein does not skip sites on the DNA, $l_s \sim l$ (the distinction
between $l_s$ and $l$ will become apparent when ITs are introduced). The total time needed to
find a specific site is then

$$ t^{\rm search} \sim N_r \tau_r , \tag{3} $$

with $\tau_r = \tau_1 + \tau_3$. Using Eqs. (2) and (3) one obtains

$$ t^{\rm search} \sim \frac{L}{\sqrt{D_1 \tau_1}} \left( \tau_1 + \tau_3 \right) . \tag{4} $$



Furthermore, it is easy to argue (see Appendix A) that

$$ \tau_3 \sim \frac{\Lambda^3}{D_3 L} . \tag{5} $$

In Fig. 2 a comparison between the presented scaling arguments and a numerical simulation
of a search that explicitly includes sliding on a DNA with a frozen configuration in a finite
volume and three-dimensional diffusion is shown (see Appendix B for details of the numerics).
The excellent agreement justifies many of the simplifications made, in particular the neglect
of correlations between the initial and final locations of the jump. Throughout the paper we
assume this always holds (see Appendix B).

The analysis leads to a richer range of possible behaviors than found in Eq. (1), where
the search time depends only on the volume in which the DNA is embedded [10]. Here, in
contrast, three regimes are found: (i) For $\tau_1 \ll \tau_3 \sim \Lambda^3/(D_3 L)$ there is no dependence on
$L$ and the search time is given to a good approximation by Eq. (1). (ii) For
$\Lambda^3/(D_3 L) \ll \tau_1 \ll L^2/D_1$ the dependence on the DNA length is linear. This is the regime
typically considered relevant for experiments. (iii) For $L^2/D_1 < \tau_1$ one finds
$t^{\rm search} \propto L^2$.

It is natural to ask which $\tau_1$ optimizes $t^{\rm search}$. Using Eq. (4) it is easy to verify that

$$ (\tau_1^{\rm opt})_0 = \tau_3 , \tag{6} $$

where $(\cdot)_0$ denotes a value obtained with no ITs. Alternatively, one can consider an optimal
antenna size $(l_{\rm opt})_0 = \sqrt{D_1 \tau_3}$. When this condition is met, the total search time scales as

$$ (t^{\rm search}_{\rm opt})_0 \sim L \sqrt{\frac{\tau_3}{D_1}} \sim \sqrt{\frac{L \Lambda^3}{D_1 D_3}} . \tag{7} $$

Note that the $\sqrt{L}$ dependence is obtained by optimizing, say, $\tau_1$ as $L$ is varied.
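
The optimum of the sliding-and-jumping search can be checked numerically. The sketch below assumes the scaling forms $t^{\rm search} \sim (L/\sqrt{D_1\tau_1})(\tau_1+\tau_3)$ and $\tau_3 \sim \Lambda^3/(D_3 L)$ with all prefactors set to one; the function names are hypothetical and the value of $D_1$ used in the example is an illustrative assumption, not a number taken from the text.

```python
import math

# Sketch of the sliding + jumping scaling estimate, all prefactors set to one.

def tau3(cell_size, d3, dna_length):
    """Typical duration of one three-dimensional excursion, Lambda^3 / (D3 L)."""
    return cell_size ** 3 / (d3 * dna_length)


def search_time(tau1, tau3_value, d1, dna_length):
    """Number of rounds L / sqrt(D1 tau1) times the round duration tau1 + tau3."""
    return dna_length / math.sqrt(d1 * tau1) * (tau1 + tau3_value)


def optimal_tau1(tau3_value, d1, dna_length, n_grid=2001):
    """Scan tau1 on a logarithmic grid from tau3/100 to 100*tau3 and return the minimizer."""
    grid = [tau3_value * 10 ** (-2 + 4 * i / (n_grid - 1)) for i in range(n_grid)]
    return min(grid, key=lambda t: search_time(t, tau3_value, d1, dna_length))


if __name__ == "__main__":
    # Lambda, D3 and L are the typical values of the text; D1 = 1e5 nm^2/s is assumed.
    t3 = tau3(cell_size=1e3, d3=1e7, dna_length=1e6)
    best = optimal_tau1(t3, d1=1e5, dna_length=1e6)
    print(f"tau3 = {t3:.2e} s, optimal tau1 = {best:.2e} s")
```

The scan returns $\tau_1 \simeq \tau_3$, and evaluating the search time at the optimum for two DNA lengths that differ by a factor of four changes the result by a factor of two, which is the $\sqrt{L}$ dependence noted above.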




FIG. 2: The search time $t^{\rm search}$ is shown as a function of the antenna length, $l$. The thin line
represents the results from numerical simulations while the bold one is given by Eq. (4). Numerics
were performed on a DNA embedded in a finite volume with a frozen configuration. The length of
the DNA was taken to be 1224000 lattice constants and $D_3 = D_1 = 1$ (see details in Appendix B).
Similar results were obtained for different values of $D_1$ and $D_3$.



This model, at the optimal $\tau_1$ and assuming known values for $D_1$, $L$ and $\tau_3$, predicts
reasonable search times in vivo and is commonly assumed to give a possible explanation for
the two orders of magnitude difference between the experiments in vitro and Eq. (1).

Within the model the optimal search process requires fine tuning of the antenna size, $l$,
as a function of the parameters $D_1$ and $\tau_3$. These parameters depend on various cell and
environmental conditions such as the size of the cell, the DNA length, the ionic strength etc.
The dependence can be quite significant: for example, the parameter $\tau_1/\tau_3$ has an exponential
dependence on the square root of the ionic strength [31]. Deviations of this parameter
from the optimum value might be crucial to the search time since

$$ \frac{t^{\rm search}}{(t^{\rm search}_{\rm opt})_0} \sim \sqrt{\frac{\tau_1}{\tau_3}} + \sqrt{\frac{\tau_3}{\tau_1}} . $$



Indeed, a strong dependence of the search time on the ionic strength was found in in vitro
experiments [?]. Interestingly, in vivo, when the DNA is densely packed, no effect of the
ionic strength on the efficiency of the Lac repressor was revealed [32]. Other experiments
also suggest that $\tau_1$ is not optimized. In particular, equilibrium measurements [21], as well
as recent single-molecule experiments [19, 20], find a value of $\tau_1$ for the dimeric Lac repressor
that is much larger than the predicted optimum $\tau_3$ in vivo.

The lack of sensitivity to the ionic strength in vivo and the rapid search times found for
the Lac repressor, even with very large values of $\tau_1$, suggest that other processes, apart from
jumping and sliding, are involved in the search process. These seem to be more important
in vivo than in vitro. In the next section we show that a search process which uses ITs
modifies the behavior found for searches which use only sliding and jumping in a significant
manner. In particular, the problems encountered above (e.g., high sensitivity to the antenna
length, very long and non-optimal measured antennas, etc.) are largely eliminated when ITs
are included.

III. PURE INTERSEGMENTAL TRANSFER 

Before turning to the full problem of a search which uses sliding, jumping and ITs we will 
consider a series of simplified models. Within the first model, considered in this section, the 
protein can only perform ITs. We will see that already at this level many of the problems 
of the search discussed above, which uses only sliding and jumping, are resolved to a large 
extent. 

To model ITs we consider a protein with two binding sites. The protein can either have
one site bound to the DNA or perform an IT to a new location by having both binding sites
bound to the DNA (see Fig. 1). The DNA is scanned for the target by the binding sites,
each checking a length $b$ when bound (note that since the protein has to align with the DNA
sequence, $b$ is of the order of the length of a single base pair). A possible motivation for this
picture is, for example, the tetrameric structure of the Lac repressor. However, as will be
evident, many results also apply to proteins with different shapes.

Motivated by DNA in cells, we consider a DNA molecule which is densely packed in a small
volume. In typical systems the DNA has a total length of $L \sim 10^6$ nm, a persistence length
$L_0 \sim 50$ nm, a cross-section radius $\rho \sim 1$ nm and is contained in a volume of $\Lambda^3 \sim 10^9$ nm$^3$.
The typical distance between segments of DNA of length $L_0$ is therefore much smaller than
$L_0$. Under these conditions, using $\Lambda \gg L_0$, it is easy to check that the radius
of gyration of free DNA, which is of the order of $L_0 \sqrt{L/L_0}$, is much larger than the cell size
$\Lambda$: the DNA is densely packed even though its fractional volume, $L\rho^2/\Lambda^3$, in the container
is small (about one percent). By way of comparison, typical protein sizes are in the range
1-10 nm, much smaller than the DNA's persistence length. Although in vivo the packing
has a more complicated structure than we consider, we expect similar behavior to occur also
there.

As stated above, the protein moves by first being bound with only one binding site and
then with both. The typical time for this, defined by $\delta = \tau_b + \tau_{IT}$, is the sum of the typical
time that the protein probes a length $b$ (by being bound with one domain) and the time that
the protein is bound with both binding domains to the DNA while performing an IT^5. We
assume that the protein moves (for example, using both legs of the Lac repressor) to a
random position located at a distance smaller than or equal to $R$, the size of the protein, from it
(see Fig. 1)^6. Defining a "chemical" coordinate $x$ which runs along the length of the DNA,
the protein can either perform moves from its location $x$ to the interval $[x - R, x + R]$ (we
refer to these as "correlated ITs" (CITs)) or reach distant sites along the chemical coordinate
available through the structure of the packed DNA.

Under the above conditions it is easy to verify (see Appendix C) that almost all ITs
performed by the protein are either correlated moves or performed to a coordinate along the
DNA whose distance from its previous location is bigger than $\Lambda^2/L_0$ (but smaller than $L$). We
call these steps "uncorrelated ITs" (UITs) (see Fig. 1(c)). In other words, one can safely
neglect the possibility that the protein will move using ITs to a chemical distance larger
than $R$ and smaller than $\Lambda^2/L_0$.

Our main interest is the typical search time. For this purpose it is useful to define $\lambda$,
the average length that the protein travels before performing a UIT. On chemical distances
larger than $R$ but smaller than $\lambda$ the motion is effectively diffusive in one dimension with
a diffusion coefficient $D_1^{\rm eff} \sim R^2/\delta$. On chemical distance scales larger than $\lambda$ and smaller
than $L$ the motion is controlled by UITs. Due to the three-dimensional nature of each UIT
one expects correlations between different UITs to be negligible. We verify this assumption
later using numerical simulations.

From the discussion, and using a language similar to that of Sec. II, the search process
can be described as a sequence of

$$ N_r \sim \frac{L}{l_s} \tag{8} $$

rounds of correlated ITs, where $l_s$ is the length scanned by the protein during each round
(namely between two subsequent UITs). The typical time of each round is

$$ \tau_r \sim \left( \frac{\lambda}{R} \right)^2 \delta . \tag{9} $$

^5 We take $\delta$ independent of parameters such as the cell size $\Lambda$ and the DNA length $L$. This is justified in a
regime where most ITs are close along the chemical coordinate of the DNA.

^6 Different scenarios are considered at the end of the section.



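In general, while performing CITs the protein can miss regions of the DNA by skipping
over them. Since each segment of size $R$ is visited $\sqrt{N} \sim \lambda/R$ times [18], where
$N \sim (\lambda/R)^2$ is the number of ITs in a round, when $\lambda/R \gg R/b$ the
walk is recurrent and no sites are skipped, so that $l_s \sim \lambda$. In contrast, when $\lambda/R \ll R/b$ the walk
is not recurrent and $l_s \sim \tau_r b/\delta \sim \lambda^2 b/R^2$. Therefore the recurrence length,

$$ l_R \sim \frac{R^2}{b} , \tag{10} $$

separates between two regimes,

$$ l_s \sim \begin{cases} \lambda^2 b / R^2 & \lambda \ll l_R \\ \lambda & \lambda \gg l_R \end{cases} , \tag{11} $$

the first transient and the second recurrent.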

Using Eqs. (3), (8), (9) and (11) the typical search time is obtained:

$$ t^{\rm search} \sim \frac{L}{l_s} \left( \frac{\lambda}{R} \right)^2 \delta \sim \begin{cases} \dfrac{L}{b}\, \delta & \lambda \ll l_R \\ \dfrac{L \lambda}{R^2}\, \delta & \lambda \gg l_R \end{cases} . \tag{12} $$

To complete the expression one needs to evaluate $\lambda$. Its value depends on various parameters
and, in particular, on the time scale which characterizes the motion of the DNA. As discussed
in the introduction we consider two extreme regimes: quenched DNA and annealed DNA.
In both cases $\lambda$ can be evaluated from an intermediate quantity, $p$, the probability that
the protein can make a UIT from a specific location $x$ on the DNA. Since this quantity is
independent of the DNA's motion we estimate it first before turning to the two regimes.
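
The round-based estimate for a pure IT search, for a given $\lambda$, can be written as a small helper. This is only a sketch of the scaling relations (number of rounds $L/l_s$, round time $(\lambda/R)^2\delta$, scanned length $l_s$ set by the recurrence length $R^2/b$), with hypothetical function names and all prefactors set to one.

```python
# Sketch of the pure-IT scaling estimate for a given lambda (prefactors set to one).

def recurrence_length(R, b):
    """Recurrence length l_R ~ R^2 / b."""
    return R * R / b


def scanned_length(lam, R, b):
    """Length l_s scanned between two uncorrelated transfers."""
    if lam < recurrence_length(R, b):
        return lam * lam * b / (R * R)  # transient walk: many sites are skipped
    return lam                          # recurrent walk: every site is visited


def round_time(lam, R, delta):
    """Duration of one round of correlated transfers, (lambda / R)^2 * delta."""
    return (lam / R) ** 2 * delta


def it_search_time(L, lam, R, b, delta=1.0):
    """Number of rounds L / l_s times the duration of a round."""
    return L / scanned_length(lam, R, b) * round_time(lam, R, delta)


if __name__ == "__main__":
    # Illustrative numbers only: R = 10, b = 1 gives l_R = 100 (same length units).
    print(it_search_time(L=1e6, lam=50.0, R=10.0, b=1.0))    # transient: L * delta / b
    print(it_search_time(L=1e6, lam=1000.0, R=10.0, b=1.0))  # recurrent: L * lam * delta / R^2
```

In the transient branch the result does not depend on $\lambda$, which is the property used later when the crossover lengths are compared.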

To do so, we consider the packed DNA as an ideal gas of straight rods of length $L_0$ that
are distributed randomly in the cell (see Fig. 3). The probability $p_{\rm seg}$ that two given rods
cross within a distance $R$ of each other is given by

$$ p_{\rm seg} \simeq \kappa \left( \frac{L_0}{\Lambda} \right)^3 \left( \frac{R}{L_0} \right)^2 = \kappa \frac{L_0 R^2}{\Lambda^3} , \tag{13} $$

where $\kappa$ is a constant of order unity. Here $(L_0/\Lambda)^3$ is the probability that a given segment
is located within a distance $L_0$ of a point inside the cell and $(R/L_0)^2$ is proportional to the
probability that this segment crosses a sphere of radius $R$ around the point. Under the
conditions described above we find that typically $p_{\rm seg} \ll 1$. Finally, to relate $p$ to $p_{\rm seg}$ we
note that to make a UIT at least one segment should be accessible. This yields

$$ p = 1 - \left( 1 - p_{\rm seg} \right)^{L/L_0} \simeq 1 - e^{-\kappa L R^2/\Lambda^3} . \tag{14} $$

FIG. 3: Illustrated schematically is the simplified treatment of the folded DNA. We first represent
the DNA as an ideal gas of rods, each with a length of one persistence length. Then we connect
the rods randomly to form a small-world network (see text for details). Numerically we find the
description to work well.

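Eq. (14) implies that there are two possible regimes depending on the value of $L$,

$$ p \simeq \begin{cases} \kappa L / L_c & L \ll L_c \\ 1 & L \gg L_c \end{cases} , \tag{15} $$

where

$$ L_c \sim \frac{\Lambda^3}{R^2} . \tag{16} $$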

In essence, when $L \gg L_c$ (which can occur, for example, by having a large protein) $p \simeq 1$ and
about half of the ITs are uncorrelated. However, when $L \ll L_c$ we have $p \simeq \kappa L/L_c \ll 1$,
and most ITs are correlated. The value of $L_c \sim \Lambda^3/R^2$ for the range of parameters of interest
remains large even for very large proteins ($R$ of the order of tens of nm, similar to the Lac
repressor). Therefore in vivo we expect a relatively large $L_c$, so the regime $L \ll L_c$ should
be relevant^7.
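
The ideal-gas-of-rods estimate of $p$ is easy to evaluate numerically. The sketch below takes $p_{\rm seg} \simeq \kappa (L_0/\Lambda)^3 (R/L_0)^2$, $p = 1-(1-p_{\rm seg})^{L/L_0}$ and $L_c \sim \Lambda^3/R^2$; the function names are hypothetical, $\kappa = 1$ is an assumption, and the example uses $\Lambda \sim 10^3$ nm, $L_0 \sim 50$ nm, $L \sim 10^6$ nm and an assumed protein size $R = 10$ nm.

```python
import math

# Sketch of the probability p that an uncorrelated transfer is possible,
# based on the ideal-gas-of-rods picture (kappa = 1 is an assumption).

def p_segment(L0, cell_size, R, kappa=1.0):
    """Probability that two given rods of length L0 cross within a distance R."""
    return kappa * (L0 / cell_size) ** 3 * (R / L0) ** 2


def p_uit(L, L0, cell_size, R, kappa=1.0):
    """Probability that at least one of the L / L0 rods is within reach."""
    return 1.0 - (1.0 - p_segment(L0, cell_size, R, kappa)) ** (L / L0)


def crossover_length(cell_size, R):
    """DNA length L_c ~ Lambda^3 / R^2 above which p saturates."""
    return cell_size ** 3 / R ** 2


if __name__ == "__main__":
    L, L0, cell_size, R = 1e6, 50.0, 1e3, 10.0  # nm; R = 10 nm is an assumed protein size
    Lc = crossover_length(cell_size, R)
    print(f"L_c = {Lc:.1e} nm, p = {p_uit(L, L0, cell_size, R):.3f}, "
          f"linear estimate L/L_c = {L / Lc:.3f}")
```

For these inputs $L/L_c$ is about 0.1, so $p$ is close to its linear form and most ITs are correlated.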



^7 In a eukaryotic cell the concentration of DNA is much higher and this statement may be wrong.

To summarize the above analysis, we note that it effectively represents motion on the
DNA, using ITs, as motion on a one-dimensional discrete network. The size of each site on
this network is $b$, the scanned length on the DNA during one binding event. Each step on the
network takes on average a time $\delta$. During an IT the protein can move from its position, $x$,
to a randomly chosen position in the interval $[x - R, x + R]$ along the chemical coordinate
(correlated transfer) with probability $1 - p$, or to an uncorrelated site with probability $p$
(uncorrelated transfer). Such networks are commonly referred to as small-world networks
[33] (see Fig. 3(c)).



To find the relation between $\lambda$ and $p$ one has to consider the dynamics of the DNA.
Below we consider two extreme cases: (a) a completely quenched DNA configuration and (b)
a strongly fluctuating DNA, which we term annealed. A quenched DNA is static throughout
the search process. An annealed DNA changes its conformation on a time scale much shorter
than that of the motion of the protein.



A. Quenched DNA 

In this section we derive the search time for a quenched DNA. In particular we will show
that it has a non-trivial behavior as a function of $L$. In the regime that is expected to be
relevant in vivo the search time is independent of the DNA's length (see Fig. 4).

For quenched DNA one expects that if a UIT can occur at point $x$ it can also happen
in a region of size $R$ around it. Similar considerations apply to sites where a UIT cannot
occur. The typical distance traveled by the protein along the DNA's chemical coordinate
between two subsequent UITs, $L > \lambda > R$, is of the order of the typical distance between
two distinct locations where a UIT can occur. This implies for $p \gg R/L$ a scaling of $\lambda$ of
the form

$$ \lambda \sim \frac{R}{p} , \tag{17} $$

where $p$ is defined above (see Eq. (15)), while for $p \ll R/L$ clearly $\lambda = L$ (see Fig. 5).

From the previous discussion one may infer that there are three distinct behaviors as a
function of $L$, shown in Fig. 5. The first regime occurs for DNA so short that a UIT cannot
occur during the search. This happens when $p \ll R/L$, or equivalently when $L \ll L_1$, where
using Eq. (15) one finds

$$ L_1 \sim \sqrt{\frac{\Lambda^3}{R}} . \tag{18} $$

In fact, the estimate for $L_1$ pushes the limit of our treatment since the DNA is no longer
densely packed in this regime. Nonetheless, we find good agreement with numerical
simulations.

FIG. 4: The search time, $t^{\rm search}$, is plotted as a function of $L$, the DNA length, for the pure IT case
with a quenched DNA configuration. The circles represent numerical data, while the solid line was
obtained using Eqs. (21), (23) and (24). The three visible regimes correspond to the three in Fig.
5 (see also Fig. 9). In this plot $R$ and $b$ were taken to be 3 and 1 lattice constants respectively
(the rest of the details are found in Appendix B). The search time is shown in units of $\delta$.

FIG. 5: The schematic behavior of $\lambda$ and $l_s$ as a function of $L$ (on a log-log scale) is shown for
quenched DNA and $b \ll R$.

The other regimes occur for $L \gg L_1$, where one has $p \gg R/L$. In this case the proteins
can use UITs during the search. As discussed above there is a length scale $L_c$ separating
two distinct behaviors of $p$, and therefore, together with $\lambda \sim L$ for $L \ll L_1$, we have three
different behaviors for $\lambda$. For $L \gg L_1$ they are given by (see Fig. 5):

$$ \lambda \sim \begin{cases} \Lambda^3/(L R) & L_1 \ll L \ll L_c \\ R & L \gg L_c \end{cases} , \tag{19} $$

where as before $L_c = \Lambda^3/R^2$. Furthermore, as described above, the scan between two
subsequent UITs can either be recurrent ($\lambda \gg l_R$) or transient ($\lambda \ll l_R$) with a crossover
length $L_2$. This length scale is determined by the condition $\lambda(L = L_2) \sim l_R$. In the
recurrent regime the walk between two UITs does not skip locations on the DNA. This is in
contrast to the transient regime where many sites are skipped. Thus using Eqs. (15) and
(17) one finds

$$ L_2 \sim \frac{\Lambda^3}{R^3}\, b . \tag{20} $$

For $L \gg L_2$ the search between two subsequent UITs is short and therefore transient, while
for $L \ll L_2$ the search between two subsequent UITs is long and therefore recurrent.

Note that when the search is transient, $t^{\rm search}$ is independent of $\lambda$ (see Eq. (12)).
Therefore, the crossover between two distinct scaling behaviors of $t^{\rm search}$ is governed by the smaller
of the two length scales $L_c$ and $L_2$. For proteins performing only ITs one expects $b$ to be
smaller than $R$. It is easy to see that in such cases $L_2$ is smaller than $L_c$. (Other possibilities
are discussed in Sec. IV.)

To summarize, there are two length scales, $L_1$ and $L_2$, which separate three possible regimes
(see Fig. 5).

• Regime I: $L \ll L_1$

In this regime $\lambda \sim L$. There are no UITs and Eqs. (11) and (12) give

$$ t^{\rm search} \sim \frac{L^2}{R^2}\, \delta . \tag{21} $$

This regime is clearly not relevant in vivo (using the typical values, $\Lambda \sim 1\,\mu$m and $R \sim 10$ nm,
we find $L_1 \sim 10\,\mu$m, which is much shorter than typical DNA lengths).

• Regime II: $L_1 \ll L \ll L_2$

Now the motion between two subsequent UITs is recurrent, $l_R \ll \lambda$, and Eq. (17) gives

$$ \lambda \sim \frac{\Lambda^3}{L R} . \tag{22} $$

Using Eqs. (11) and (12) we obtain

$$ t^{\rm search} \sim \frac{\Lambda^3}{R^3}\, \delta . \tag{23} $$

Note that in this regime, as opposed to Sec. II, the search time is independent of the DNA's
length. Eq. (23) is equivalent to Eq. (1) with an effective three-dimensional diffusion
coefficient $D_3 \sim R^2/\delta$. In contrast to the simple three-dimensional diffusive search, Eq. (23)
does not depend on the target size $r$ but rather on the protein size, which may be much
larger.

• Regime III: $L \gg L_2$

Here $\lambda \ll l_R$, and Eqs. (11) and (12) give

$$ t^{\rm search} \sim \frac{L}{b}\, \delta . \tag{24} $$

The obtained results, compared to numerics, are summarized in Fig. 4 (see also Fig.
9). One can clearly see the three regimes arising for different lengths of DNA, which are
separated by $L_1$ and $L_2$. The details of the numerical simulation are described in Appendix
B. Note that $L_1$ and $L_2$ are well predicted by the scaling arguments.
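
The three quenched regimes can be collected into one piecewise function. The following is a sketch under the scaling forms derived above ($L_1 \sim \sqrt{\Lambda^3/R}$, $L_2 \sim \Lambda^3 b/R^3$, and search times $L^2\delta/R^2$, $\Lambda^3\delta/R^3$ and $L\delta/b$), with prefactors set to one; the function names are hypothetical and the numbers in the example are illustrative choices.

```python
import math

# Sketch of the quenched, pure-IT search time as a piecewise scaling function.
# Prefactors of order one are dropped; the regime order assumes L1 < L2.

def quenched_crossovers(cell_size, R, b):
    """Return the crossover lengths (L1, L2)."""
    L1 = math.sqrt(cell_size ** 3 / R)
    L2 = cell_size ** 3 * b / R ** 3
    return L1, L2


def quenched_search_time(L, cell_size, R, b, delta=1.0):
    """Search time in the same units as delta."""
    L1, L2 = quenched_crossovers(cell_size, R, b)
    if L1 >= L2:
        raise ValueError("regime II does not exist for these parameters")
    if L < L1:
        return L ** 2 * delta / R ** 2          # regime I: no uncorrelated transfers
    if L < L2:
        return cell_size ** 3 * delta / R ** 3  # regime II: independent of L
    return L * delta / b                        # regime III: transient scan


if __name__ == "__main__":
    # Illustrative values in nm: cell size 1e3, R = 10, b = 0.34.
    for L in (1e3, 5e4, 2e5, 1e6):
        print(f"L = {L:.0e}: t = {quenched_search_time(L, 1e3, 10.0, 0.34):.3g} (units of delta)")
```

The function is continuous at both crossover lengths, which is a quick consistency check on the scaling expressions.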

The most relevant regime for in vivo experiments in prokaryotic organisms is likely to be
the intermediate regime (II), where the search time is independent of the DNA's length and
scales as $\Lambda^3$. Comparing the search time in this regime, Eq. (23), with the minimal search time in
the case when sliding and jumping are used, Eq. (7), one may see that if
$\delta < R^3 \sqrt{L/(\Lambda^3 D_1 D_3)}$ the
search time in the pure IT scenario is in fact smaller than the one of Sec. II, which includes
only sliding and jumping. This is despite the fact that the protein never unbinds from
the DNA.

B. Annealed DNA

In this section we consider the annealed case. As we show, here the search time also has
a non-trivial behavior as a function of $L$, but one different from the quenched case. In the regime
that is expected to be relevant in vivo the search time scales as $\sqrt{L}$.

In the annealed case the time scale for a rearrangement of the DNA's configuration is
assumed to be much smaller than the time of the protein's motion during an IT. As a result
of the constant rearrangement of the DNA, UITs now occur with probability $p$ for each IT.
The average number of ITs performed with no UITs is therefore of the order of $1/p$, and thus
the average time that the protein spends between two subsequent UITs is $\delta/p$ ($\delta$, as before,
is the typical time between two subsequent ITs). On one-dimensional length scales smaller
than $\lambda$ the protein diffuses with a diffusion constant $D_1^{\rm eff} \sim R^2/\delta$. Therefore, the typical
one-dimensional distance between two subsequent UITs, $\lambda$, is

$$ \lambda \sim \sqrt{D_1^{\rm eff}\, \frac{\delta}{p}} \sim \frac{R}{\sqrt{p}} \sim \begin{cases} \sqrt{\Lambda^3/L} & L \ll L_c \\ R & L \gg L_c \end{cases} , \tag{25} $$

where $L_c$ is defined in Eq. (16). As for the quenched case, we will see that again three
distinct behaviors arise with two crossover lengths.

The first crossover marks the length below which no UITs occur. The crossover length can be
extracted using the condition $\lambda(L = L_1^{a}) \sim L$, which under our assumptions on the protein's
size can only occur when $L \ll L_c$. This yields

$$L_1^{a} \sim \Lambda .$$

It is easy to see that $L_1^{a} \ll L_1$. This means that, as expected, in the annealed case the
effects of UITs become important at much smaller DNA concentration than in the quenched
case. This happens because fast DNA movements increase the probability to perform an
UIT. As for $L_1$, the estimate for $L_1^{a}$ pushes the limit of our treatment since the DNA is no
longer densely packed in this regime.

The second crossover length occurs when the motion between UITs becomes transient.
It can therefore be estimated using $\lambda(L = L_2^{a}) \sim l_r$, with the recurrence length $l_r$ of Eq. (10).
Taking the regime $L \ll L_c$ in Eq. (25) yields

$$L_2^{a} \sim \frac{\Lambda^3 b^2}{R^4} . \tag{26}$$

For target sizes much smaller than the protein size ($b \ll R$), it is clear that $L_2^{a} \ll L_c$ (see
Eq. (16)). Hence, using the same arguments as before, only two length scales, $L_1^{a}$ and $L_2^{a}$,
determine three possible regimes (see Fig. 6).

FIG. 6: The schematic behavior of $\lambda$ and $l_s$ as a function of $L$ (on a log-log scale) is shown for
annealed DNA and $b \ll R$.
The three regimes which arise are:

• Regime I: $L \ll L_1^{a}$

Here $\lambda \sim L$. There are no UITs and Eqs. (11) and (12) give

$$t^{\mathrm{search}} \sim \frac{L^2}{D_{1\mathrm{eff}}} \sim \frac{L^2}{R^2}\,\delta . \tag{27}$$

• Regime II: $L_1^{a} \ll L \ll L_2^{a}$

Here searches between two subsequent UITs are recurrent, so that $l_r \ll \lambda$. Eq. (25) gives

$$\lambda \sim \sqrt{\frac{\Lambda^3}{L}} . \tag{28}$$

Using Eqs. (11) and (12) we obtain

$$t^{\mathrm{search}} \sim \frac{\sqrt{\Lambda^3 L}}{R^2}\,\delta . \tag{29}$$

Here, in contrast to the quenched case, the intermediate result scales with the length of the
DNA as $L^{1/2}$. Note that the search time is always shorter than that on a quenched DNA.
This happens because the DNA's movement destroys the correlation in the motion of the
protein and, therefore, increases the efficiency of the search. A similar dependence on $L$
was obtained for a different model [11]. There, however, the origin of the
dependence is different, and is linked to modeling the DNA's motion as diffusion of an ideal
gas of rods.

• Regime III: $L \gg L_2^{a}$

Here $\lambda \ll l_r$. Therefore, Eqs. (11) and (12) give

$$t^{\mathrm{search}} \sim \frac{L}{b}\,\delta . \tag{30}$$

The obtained results are summarized later in Fig. 9.

The most relevant regime for in vivo experiments in prokaryotic organisms is likely to
be the intermediate one (II), where the search time scales as $L^{1/2}$ or alternatively as $\Lambda^{3/2}$.
Comparing the search time in this regime (Eq. (29)) with the minimal search time in the case
when sliding and jumping are used (Eq. (7)), one may see that if $\delta < R^2/\sqrt{D_1 D_3}$ the search time in the
pure IT case is smaller than the one in the sliding and jumping case. This is despite the
fact that the protein never unbinds from the DNA.

Numerical simulations of the annealed case require dynamical moves for the whole DNA
molecule. This is a formidable task for DNA of reasonable length and is beyond the
scope of this paper.
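The three annealed regimes can be collected into a single piecewise estimate. The following is a minimal sketch of the scaling argument only, not a simulation: all prefactors of order one are dropped, the function name is hypothetical, and the recurrence length is assumed to be $l_r \sim R^2/b$ (the value implied by requiring each site to be visited about $R/b$ times before a target of size $b$ is found).

```python
import math


def annealed_it_search_time(L, Lam, R, b, delta):
    """Scaling estimate of the pure-IT search time on annealed DNA.

    L: DNA length, Lam: cell size, R: protein size, b: target size,
    delta: typical time between ITs. Order-one prefactors are dropped.
    """
    p = min(1.0, L * R**2 / Lam**3)      # probability that an IT is uncorrelated
    lam = min(L, R / math.sqrt(p))       # distance between UITs, capped by the DNA length
    l_r = R**2 / b                       # ASSUMED recurrence length for a target of size b
    d1_eff = R**2 / delta                # effective 1D diffusion constant from CITs
    if lam >= l_r:
        # Recurrent scans (regimes I and II): L/lam rounds, each lasting lam^2/D.
        return L * lam / d1_eff
    # Transient scans (regime III): each round covers only lam^2/l_r.
    return L * l_r / d1_eff
```

With `Lam=100`, `R=1`, `b=0.1` and `delta=1` the crossovers of this sketch sit at $L \sim 10^2$ and $L \sim 10^4$, so quadrupling `L` between them should double the returned time, which is the $\sqrt{L}$ behavior of the intermediate regime.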



IV. INTERSEGMENTAL TRANSFER AND SLIDING 

Next we consider a protein that can perform both ITs and sliding. Namely, in addition to
ITs the protein can perform one-dimensional diffusion with only one binding domain bound
(see Fig. [U^a)). In the language of Sec. III, $b$ is now the typical sliding length between
two subsequent ITs. Now each step (distinct from a round defined above), defined as the
interval between the ends of two subsequent ITs, takes a typical time $\delta = b^2/(2D_1) + \tau_{IT}$, where
$D_1$ is the one-dimensional diffusion coefficient of the protein with only one binding domain
bound^ and $\tau_{IT}$ is the typical time that the protein is bound to two DNA segments.

^ The one-dimensional diffusion on length scales larger than $b$ has a different effective diffusion coefficient
due to the possibility of a CIT. Thus, to measure $D_1$ on large length scales one should not allow for ITs.
This may be done, for example, by measuring the motion of the part of the protein that contains only
one binding domain [19, 20].

If $b \ll R$ it is straightforward to see that the results of Sec. III hold with a redefined
$\delta$. However, in general the sliding length $b$ might be much larger than the protein's size $R$.
This is the regime that we focus on in this section.

Clearly, now the search between two subsequent UITs is always recurrent, so that $l_s \sim \lambda$.
Here, as before, $\lambda$ is the typical distance traveled by the protein between two subsequent
UITs. However, now $D_{1\mathrm{eff}} \sim b^2/\delta$, where as above $\delta = b^2/(2D_1) + \tau_{IT}$. The search time as a
function of $\lambda$, similar to Eq. (12), becomes

$$t^{\mathrm{search}} \sim \frac{L}{\lambda}\,\frac{\lambda^2}{D_{1\mathrm{eff}}} \sim \frac{L\lambda}{b^2}\,\delta . \tag{31}$$

The value of $\lambda$, as in the previous section, depends on the dynamics of the DNA molecule.
Again we consider two extreme cases: (a) quenched DNA and (b) annealed DNA.



A. Quenched DNA 

To obtain $\lambda$ we first introduce a new quantity, $\lambda_0$, defined as the typical chemical distance
between two locations in which the protein can perform an UIT. Note that we are interested
in the regime $b \gg R$. Therefore the values of $\lambda$ and $\lambda_0$ may be distinct, since an UIT is
not necessarily performed at every possible location on the DNA. Clearly, however, the
functional behavior of $\lambda_0$ is identical to that of $\lambda$ in the previous section. This yields (see
Sec. III A)

$$\lambda_0 \sim \begin{cases} L, & L \ll L_1 \\ \Lambda^3/(LR), & L_1 \ll L \ll L_c \\ R, & L \gg L_c \end{cases} \tag{32}$$

where we have used the definitions of $L_1$ and $L_c$ of the previous section.

Similar to the derivation of Eq. (10), when $\lambda_0/b \gg b/R$ the effective random walk of
the protein along a length $\lambda_0$ is recurrent. Here recurrent motion implies that sites where
an UIT can occur are visited many times before a neighboring site where an UIT can occur
is met (note that this is distinct from the recurrent behavior of Sec. III). In the recurrent
regime a location of a possible UIT is visited many times and therefore not missed. In
this case $\lambda \sim \lambda_0$. In the opposite, transient regime (again distinct in meaning from that
used in Sec. III), $\lambda_0/b \ll b/R$ and the protein performs an UIT only after it travels a distance
$\lambda \gg \lambda_0$. In the latter regime each IT has a probability $R/\lambda_0$ to be an UIT. Therefore between
two subsequent UITs the protein performs $\lambda_0/R$ ITs. Using the diffusive nature of the motion
we find $\lambda \sim b\sqrt{\lambda_0/R}$. The value of $\lambda$ as a function of $\lambda_0$ is shown schematically in Fig. 7.

FIG. 7: The schematic behavior of $\lambda$ and $l_s$ as a function of $L$ (on a log-log scale) is shown for
quenched DNA, L > ^ and $b \gg R$.

Combining the three regimes of $\lambda_0$ with the above mentioned crossover from $\lambda \sim \lambda_0$ to
$\lambda \sim b\sqrt{\lambda_0/R}$ (which occurs at $L = \Lambda^3/b^2$) one finds, using $b/R \gg 1$, four regimes for the
search time:

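• Regime I occurs for $L \ll L_1$, corresponding to $\lambda \sim \lambda_0 = L$ in Eq. (32). Using Eq. (31) gives

$$t^{\mathrm{search}} \sim \frac{L^2}{b^2}\,\delta . \tag{33}$$

• Regime II occurs for $L_1 \ll L \ll \Lambda^3/b^2$ and $\lambda \sim \lambda_0 \sim \Lambda^3/(LR)$. Using Eq. (31) yields

$$t^{\mathrm{search}} \sim \frac{\Lambda^3}{b^2 R}\,\delta . \tag{34}$$

• Regime III occurs for $\Lambda^3/b^2 \ll L \ll L_c$. Now $\lambda \sim b\sqrt{\lambda_0/R} \sim (b/R)\sqrt{\Lambda^3/L}$. Using Eq. (31) we
find

$$t^{\mathrm{search}} \sim \frac{\sqrt{\Lambda^3 L}}{bR}\,\delta . \tag{35}$$

• Regime IV occurs for $L \gg L_c$. Here $\lambda \sim b\sqrt{\lambda_0/R} \sim b$ and with Eq. (31) one gets

$$t^{\mathrm{search}} \sim \frac{L}{b}\,\delta . \tag{36}$$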
FIG. 8: $t^{\mathrm{search}}$ is plotted as a function of $L$, the DNA length, for a model with IT and sliding on
a quenched DNA. The thin line with dots represents numerical data, while the bold solid line was
obtained using Eqs. (33), (34), (35) and (36) (see also Fig. 9). In this plot $R$ and $b$ were taken to
be 1 and 20 lattice constants respectively (the rest of the details can be found in Appendix B).
The search time is shown in units of $\delta$.

Fig. 8 shows a comparison between the four theoretically predicted regimes and the
numerical simulation of the model. Three regimes are reproduced by the numerics, while the
fourth one was not reproduced due to computational limitations.

For moderate values of $\tau_{IT}$ one may see that long sliding may drastically decrease
the efficiency of the search. This occurs because long sliding prevents both UITs, which
destroy correlations in the search process, and CITs, which increase the effective one-dimensional
diffusion constant.
B. Annealed DNA

Here, using the arguments presented in Sec. III B, the average number of steps performed
between two subsequent UITs is of the order of $1/p$, where $p$ is given in Eq. (15). This implies
a typical time between subsequent UITs of the order of $\delta/p$. Using the fact that along the
DNA the motion of the protein is diffusive with an effective diffusion constant $D_{1\mathrm{eff}} \sim b^2/\delta$
one finds

$$\lambda \sim \sqrt{\frac{D_{1\mathrm{eff}}\,\delta}{p}} . \tag{37}$$

Clearly, $\lambda$ can only take values in the range $b < \lambda < L$. These, with the possible values of $p$
(see Eq. (15)), define the borders of the following three regimes:

• Regime I occurs for $\lambda \sim L$. Using Eq. (37) and $p = LR^2/\Lambda^3$ it can be verified that
this regime occurs when $L \ll \Lambda\,(b/R)^{2/3}$. In this case no UITs occur during the search
and Eq. (31) gives

$$t^{\mathrm{search}} \sim \frac{L^2}{b^2}\,\delta . \tag{38}$$

• Regime II occurs when $\Lambda\,(b/R)^{2/3} \ll L \ll L_c$. Using Eq. (37) and $p = LR^2/\Lambda^3$ one
finds that in this case $\lambda \sim b\sqrt{\Lambda^3/(LR^2)}$. Using Eq. (31) gives

$$t^{\mathrm{search}} \sim \frac{\sqrt{\Lambda^3 L}}{bR}\,\delta . \tag{39}$$

• Regime III occurs where $L \gg L_c$ and almost all ITs are UITs. Here $\lambda \sim b$ and $p \simeq 1$,
so that Eq. (31) gives

$$t^{\mathrm{search}} \sim \frac{L}{b}\,\delta . \tag{40}$$

The obtained results are summarized in Fig. 9.

One may see that in the case of long sliding, rapid DNA motion cannot decrease the
search time as significantly as in the pure IT case. This is because long sliding prevents fast
decay of correlations.
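The quenched and annealed estimates for long sliding can be compared side by side. The sketch below is an illustration under stated assumptions rather than the procedure used for the figures: it takes the search time to be $L\lambda\delta/b^2$, uses $p = \min(1, LR^2/\Lambda^3)$, takes $\lambda_0 \sim R/p$ (capped by $L$) for the quenched case, drops all order-one prefactors, and uses hypothetical function names.

```python
import math


def uit_spacing(L, Lam, R, b, dna):
    """Typical 1D distance between two UITs for IT plus sliding (b >> R)."""
    p = min(1.0, L * R**2 / Lam**3)          # fraction of sites where an UIT is possible
    if dna == "annealed":
        lam = b / math.sqrt(p)               # 1/p diffusive steps of length b
    elif dna == "quenched":
        lam0 = min(L, R / p)                 # chemical distance between possible UIT sites
        if lam0 / b >= b / R:
            lam = lam0                       # recurrent: no possible UIT site is missed
        else:
            lam = b * math.sqrt(lam0 / R)    # transient: lam0/R ITs between UITs
    else:
        raise ValueError("dna must be 'annealed' or 'quenched'")
    return max(b, min(L, lam))               # lambda is bounded by b and L


def search_time(L, Lam, R, b, delta, dna):
    """Scaling estimate t ~ L * lambda * delta / b^2."""
    return L * uit_spacing(L, Lam, R, b, dna) * delta / b**2
```

For example, with `Lam=100`, `R=1`, `b=5`, `delta=1` and `L=1e4` the quenched estimate is twice the annealed one, while for `L=1e5` the two coincide, in line with the statement that rapid DNA motion helps little once sliding is long.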



C. Motion with no CIT 



Here we consider a case where the structure of the protein causes it to prefer UITs over
CITs. This may occur, for example, in cases where the "legs" of the protein are antiparallel
and rigid. The motion on length scales smaller than $\lambda$ is then diffusive, involving only sliding
with a diffusion coefficient $D_1$. In this case, clearly $l_s = \lambda$ and the time between two
subsequent UITs is given by $\lambda^2/(2D_1) + \tau_{IT}$, where $\tau_{IT}$ is the time of an UIT. One finds, similar to
Sec. II,

$$t^{\mathrm{search}} \sim \frac{L}{\lambda}\left(\frac{\lambda^2}{2D_1} + \tau_{IT}\right) . \tag{41}$$

The relationship between this and the picture of Sec. II is given by identifying the antenna's
length $l$ with $\lambda$ and the three-dimensional diffusion time with $\tau_{IT}$.

FIG. 9: The schematic behavior of $t^{\mathrm{search}}$ as a function of the DNA length $L$ is
shown in the absence of jumps. (a) shows short sliding results ($b \ll R$). (b) shows long sliding
results ($b \gg R$).

Most of the results of Secs. III and IV are summarized in Fig. 9. The results of this
section indicate that ITs may supply reasonable search times if they are quick enough.
Combining IT with sliding we see that even rare UIT events may break correlations created
by one-dimensional diffusion. In this sense ITs act as jumps without the need for detachment
from the DNA. Besides this, CITs may effectively accelerate the one-dimensional diffusion
or even replace it altogether.
V. INTERSEGMENTAL TRANSFER AND JUMPING

We now turn to consider the effect of jumping on the results described above. Before
addressing the full problem, including ITs, sliding and jumping, we first consider a model
in which only ITs and jumps occur, and ignore sliding. To include jumping we assign
a probability $dt/\tau_1$ for a protein to detach from the DNA during a time interval $dt$. The
unbinding initiates a jump in which the protein uses three-dimensional diffusion to rebind
at a new location on the DNA. Note that since there is no sliding it is safe to assume $b \ll R$.

As argued in the previous section, it is reasonable that both UITs and jumps move the
protein to a new location which is chosen randomly on the DNA. Therefore, the search process
is composed of a series of one-dimensional scans (occurring through CITs) of the DNA
interrupted by uncorrelated relocations. The uncorrelated relocations can occur through two
independent processes: jumps and UITs. The typical search time can be evaluated using an
approach identical to that of the previous sections.

First, we need to estimate the typical time $\tau_{1\mathrm{eff}}$ between two uncorrelated relocations.
Combining the previously derived typical time between two subsequent UITs, $\lambda^2/(2D_{1\mathrm{eff}})$, and
the typical time between jumps, $\tau_1$, we obtain^

$$\tau_{1\mathrm{eff}} = \frac{1}{\dfrac{2D_{1\mathrm{eff}}}{\lambda^2} + \dfrac{2D_{1\mathrm{eff}}}{l^2}} , \tag{42}$$

where $\lambda$, defined before, is the typical distance that the protein travels between two subsequent
UITs and we define an antenna length $l = \sqrt{2D_{1\mathrm{eff}}\tau_1}$.

Here and in the next section we focus on the search time as a function of $l$. This quantity
is influenced by the protein-DNA non-specific binding energy and governs the frequency of
jumps. Other parameters that do not depend on $l$, such as $\lambda$, are taken as fixed. The value
of $\lambda$ relevant for the discussion here is given in Sec. III, where $b \ll R$. Note that when this value is
incorporated in the results below the resulting behavior is very complicated. While this is
easy to obtain, we do not enumerate all the regimes and focus on the important qualitative behavior.

^ This expression is exact in the annealed case but it is only an approximation in the quenched regime.
However, the error does not exceed 50% (see Appendix D 1 for details).

To proceed we note that the typical distance between two uncorrelated relocation events
is given by

$$l_{\mathrm{eff}} = \sqrt{2D_{1\mathrm{eff}}\tau_{1\mathrm{eff}}} = \frac{l\lambda}{\sqrt{l^2 + \lambda^2}} . \tag{43}$$

As expected, and seen in Eqs. (42) and (43), the relative importance of both mechanisms
is controlled by the ratio $l/\lambda$. In the case of $l/\lambda \gg 1$ jumping is rare compared to UITs and
may be neglected, leading to the behavior found in Sec. III. In the opposite case, $l/\lambda \ll 1$,
the possibility of performing an UIT is negligible and the results of Sec. II hold.
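How the two relocation mechanisms combine can be made concrete numerically. The following sketch assumes, as in the annealed case, that jumps and UITs are independent Poisson processes so that their rates simply add; the function name and the returned jump probability are illustrative choices, not part of the original derivation.

```python
import math


def effective_scales(tau1, lam, d1_eff):
    """Combine jumps (mean waiting time tau1) with UITs (spacing lam).

    Returns the antenna length l, the effective time between uncorrelated
    relocations, the effective scan length, and the probability that a
    relocation is a jump. Assumes independent Poisson processes.
    """
    l = math.sqrt(2.0 * d1_eff * tau1)        # antenna length set by jumps alone
    tau_uit = lam**2 / (2.0 * d1_eff)         # typical time between two UITs
    tau1_eff = 1.0 / (1.0 / tau1 + 1.0 / tau_uit)
    l_eff = math.sqrt(2.0 * d1_eff * tau1_eff)
    p_jump = tau1_eff / tau1                  # share of relocations that are jumps
    return l, tau1_eff, l_eff, p_jump
```

In the limit `lam` much larger than `l` the function returns `tau1_eff` close to `tau1` (jumps dominate), and in the opposite limit `l_eff` approaches `lam` (UITs dominate), mirroring the two limiting cases described above.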

Finally, we must estimate the average time spent by the protein performing one uncorrelated
relocation. This is given by the average of the jump time, $\tau_3$, and the time of an IT,
weighted with the probability of performing each. This gives

$$\tau_{3\mathrm{eff}} = \frac{\tau_{1\mathrm{eff}}}{\tau_1}\,\tau_3 + \left(1 - \frac{\tau_{1\mathrm{eff}}}{\tau_1}\right)\delta = \frac{\lambda^2\tau_3 + l^2\delta}{l^2 + \lambda^2} , \tag{44}$$

where $\tau_{1\mathrm{eff}}/\tau_1$ is the probability of a jump, $1 - \tau_{1\mathrm{eff}}/\tau_1 = 2D_{1\mathrm{eff}}\tau_{1\mathrm{eff}}/\lambda^2$ is the probability of an UIT
and $\delta$, defined above, is the time of an IT (see Appendix D 2 for a more detailed derivation).

The total search time takes the same form as before. Now, each search round is
defined as the interval between two subsequent uncorrelated relocations. The total time of
one round is $\sim \tau_{1\mathrm{eff}} + \tau_{3\mathrm{eff}}$, and therefore the search time is given by

$$t^{\mathrm{search}} \sim \frac{L}{l_s}\left(\tau_{1\mathrm{eff}} + \tau_{3\mathrm{eff}}\right) . \tag{45}$$

Here $l_s$ is the length scanned between two subsequent uncorrelated relocations. In the case
discussed here $b \ll R$, and the value of $l_s$ depends on the properties of the search between
two uncorrelated relocations, namely the ratio of $l_{\mathrm{eff}}$ and $l_r$, the recurrence length (see Eq.
(10) and the relevant discussion). If $l_{\mathrm{eff}} \gg l_r$ the search between two subsequent relocations is
recurrent and $l_s \sim l_{\mathrm{eff}}$. However, in the opposite regime, $l_{\mathrm{eff}} \ll l_r$, $l_s \sim l_{\mathrm{eff}}^2/l_r$.
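Therefore, for a given $\lambda$ there are two regimes (see Figs. 10 and 11):

• Regime I ($l_r \ll l_{\mathrm{eff}}$)

In this regime, using Eq. (45), the total search time is

$$t^{\mathrm{search}} \sim \frac{L}{l}\left(\tau_1 + \tau_3\right)\left(1 + \frac{l^2}{\lambda^2}\right)^{-1/2} , \tag{46}$$

where we used $\lambda \gg R$.

Comparing with Eq. (5) we note that here we have both an effective diffusion constant
and an extra enhancement factor given by $\left(1 + l^2/\lambda^2\right)^{-1/2}$. As we now show, this factor has
important consequences.

Consider the value of $\tau_1 = l^2/(2D_{1\mathrm{eff}})$ for which a minimal search time is obtained and compare
it with the usual paradigm of $(\tau_1^{\mathrm{opt}})_0 = \tau_3$. Due to the enhancement factor we now find

$$\tau_1^{\mathrm{opt}} = \frac{(\tau_1^{\mathrm{opt}})_0}{1 - 4D_{1\mathrm{eff}}\tau_3/\lambda^2} , \tag{47}$$

where $(\tau_1^{\mathrm{opt}})_0 = \tau_3$ (see Eq. (6)) is the optimal value in the absence of ITs ($\lambda \to \infty$)
(see Sec. II). It is interesting to note that $l^{\mathrm{opt}}$ approaches infinity when $\tau_3$ is larger than a
critical value

$$\tau_{3c} = \frac{\lambda^2}{4D_{1\mathrm{eff}}} . \tag{48}$$

Hence, the minimal search time for $\tau_3 > \tau_{3c}$ is identical to that with no jumps (see Sec. III).
It is important to note that $\tau_{3c}$ depends, as expected, on the time of an IT through $D_{1\mathrm{eff}}$.
In the case when $\tau_3 < \tau_{3c}$, Eqs. (46) and (47) give

$$t^{\mathrm{search}}_{\mathrm{opt}} \sim L\sqrt{\frac{2\tau_3}{D_{1\mathrm{eff}}}\left(1 - \frac{2D_{1\mathrm{eff}}\tau_3}{\lambda^2}\right)} . \tag{49}$$

In this regime $t^{\mathrm{search}}_{\mathrm{opt}}$ is monotonically increasing in $\tau_3$.

In Fig. 12 we show a comparison between the results of numerical simulation and Eq. (46).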


• Regime II ($l_{\mathrm{eff}} \ll l_r$)

In this case $l_s \sim l_{\mathrm{eff}}^2/l_r$ and Eq. (45) yields

$$t^{\mathrm{search}} \sim \frac{L\,l_r}{2D_{1\mathrm{eff}}}\left(1 + \frac{\tau_3}{\tau_1} + \frac{2D_{1\mathrm{eff}}\,\delta}{\lambda^2}\right) . \tag{50}$$

FIG. 10: Possible regimes as a function of $l$ and $\lambda$ are shown in the case of ITs and jumping (or IT,
jumping and sliding with $b \ll R$) for $l_r \ll \sqrt{2D_{1\mathrm{eff}}\tau_3}$. The gray (white) area represents regime I
(II). The dashed line represents the optimal antenna length. The optimal antenna length in the
absence of IT is equal to $\sqrt{2D_1\tau_3}$.

Interestingly, in this regime the minimal search time is obtained when $\tau_1$ diverges. This
means that jumping only makes the search slower in this case. We note that some care
needs to be taken with the limit, since if $\lambda > l_r$ and the value of $l$ exceeds $l_r$ the regime
$l_{\mathrm{eff}} \ll l_r$ transforms into Regime I.
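The appearance of a critical jump time in Regime I can be checked numerically. The sketch below assumes that the Regime I search time takes the form $(L/l)(\tau_1 + \tau_3)(1 + l^2/\lambda^2)^{-1/2}$ with $l = \sqrt{2D_{1\mathrm{eff}}\tau_1}$, which is what combining the effective round time and scan length gives for $\lambda \gg R$; minimizing this expression over $\tau_1$ gives $\tau_1^{\mathrm{opt}} = \tau_3/(1 - \tau_3/\tau_{3c})$ with $\tau_{3c} = \lambda^2/(4D_{1\mathrm{eff}})$. Function names are hypothetical and prefactors are not meant to be quantitative.

```python
import math


def regime_one_search_time(tau1, tau3, lam, d1_eff, L=1.0):
    """Regime I estimate: (L/l) (tau1 + tau3) / sqrt(1 + l^2/lam^2)."""
    l = math.sqrt(2.0 * d1_eff * tau1)
    return (L / l) * (tau1 + tau3) / math.sqrt(1.0 + (l / lam) ** 2)


def optimal_tau1(tau3, lam, d1_eff):
    """Residence time minimizing the Regime I estimate (infinite above tau3c)."""
    tau3c = lam**2 / (4.0 * d1_eff)           # critical jump time
    if tau3 >= tau3c:
        return math.inf                       # never detaching is optimal
    return tau3 / (1.0 - tau3 / tau3c)
```

For `tau3=10`, `lam=10` and `d1_eff=1` the critical value is 25, the optimum sits at roughly 16.7 rather than at the canonical value of 10, and for `tau3=50` the estimate keeps decreasing as `tau1` grows.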

The results of this section highlight several interesting features which will also appear
in the more general case, where sliding is also allowed. First, we note that in the limit
of very strong protein-DNA affinity (large values of $\tau_1$) the search time becomes robust to
changes in the value of $\tau_1$. This is very different from a search process with no ITs (see Eqs.
(46), (50) and Fig. 12), and may give a possible explanation for the difference between in
vitro and in vivo experiments on the Lac repressor. In vitro [7], a strong dependence of the search time
on ionic strength (and therefore on the protein-DNA affinity) was found. However, an in vivo
experiment [32] found that the efficiency of the repression by the same protein is very robust
to changes in the ionic strength.

Furthermore, by examining the optimal search time, we find that beyond some critical
value of $\tau_3$ jumps increase the search time (see Fig. 12 for a demonstration). This may give
a possible explanation of the values of $\tau_1$ obtained in vitro [19] and in vivo for the Lac
repressor. These are much larger than the optimal $\tau_1$ predicted by models that do not
include ITs.

FIG. 11: Possible regimes as a function of $l$ and $\lambda$ are shown in the case of ITs and jumping (or IT,
jumping and sliding with $b \ll R$) for $l_r \gg \sqrt{2D_{1\mathrm{eff}}\tau_3}$. The gray (white) area represents regime I
(II). The dashed line represents the optimal antenna length. The optimal antenna length in the
absence of IT is equal to $\sqrt{2D_1\tau_3}$.

In Fig. 12 a comparison between Eq. (46) and numerical simulation is shown. One may
see that increasing the value of $\tau_3$ increases the optimal value of $l$ (or equivalently $\tau_1$) in
such a way that above some critical value, predicted by Eq. (48), it becomes infinite.

VI. INTERSEGMENTAL TRANSFER, SLIDING AND JUMPING 

With the results of the previous section it is straightforward to consider the general case
where ITs, sliding and jumping are allowed. Similar to the previous section we show that
jumping may slow the search process significantly. However, ITs make the search process
much more robust to variations in parameters.

First consider the case $b \ll R$ where sliding events are very short. Clearly, in this case
the results of the previous section hold with $\delta = b^2/(2D_1) + \tau_{IT}$. Here, as in Sec. IV, $D_1$ is the
one-dimensional diffusion coefficient for sliding and $\tau_{IT}$ is the typical time that the protein is
bound to two DNA segments. With this in mind, in this section we discuss only the opposite
case of $b \gg R$. Here, as in Sec. V, the parameters that do not depend on $l$, such as $\lambda$, are
taken as given. Sec. IV contains the relevant derivation of $\lambda$ for the case
discussed here of long sliding, $b \gg R$.

FIG. 12: The influence of ITs on the search time is shown. The search time, $t^{\mathrm{search}}$,
is plotted as a function of the antenna length, $l$, for different values of $\tau_3$
(140, 1400, 14000, 1400000, 5000000, 14000000 in units of $\delta$ from bottom up). Here only ITs
and jumping are allowed. Thin solid lines represent the numerical results. The bold solid lines
represent analytic results (Eq. (46)). The black dashed lines represent the search time in the case
with no ITs, obtained by using Eq. (5) with the effective diffusion constant $D_{1\mathrm{eff}} \sim R^2/\delta$ instead
of $D_1$. Here $L$, $R$ and $b$ were taken to be 1224000, 1 and 1 lattice constants respectively. Since
$R = b = 1$, diffusion through sliding is identical to diffusion through CITs. This allows us to directly
compare sliding and jumping with ITs and jumping.

As shown in Sec. IV, in this case $D_{1\mathrm{eff}} \sim b^2/\delta$ with $\delta = b^2/(2D_1) + \tau_{IT}$. Following Sec. V we
first need $\tau_{3\mathrm{eff}}$, the typical time of an uncorrelated relocation. This is given by (see the
derivation of Eq. (44) and Appendix D)

$$\tau_{3\mathrm{eff}} = \frac{\lambda^2\tau_3 + l^2\tau_{IT}}{l^2 + \lambda^2} . \tag{51}$$

Note that here, since $b \gg R$, the search between two subsequent uncorrelated relocations
is always recurrent and therefore $l_s \sim l_{\mathrm{eff}}$. Therefore, similar to Sec. V, the search time is
given by

$$t^{\mathrm{search}} \sim \frac{L}{l_{\mathrm{eff}}}\left(\tau_{1\mathrm{eff}} + \tau_{3\mathrm{eff}}\right) . \tag{52}$$



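Using Eqs. (43) and (51), the total search time can be written as

$$t^{\mathrm{search}} \sim \frac{L}{l}\left(1 + \frac{l^2}{\lambda^2}\right)^{-1/2}\left(\tau_1 + \tau_3 + \frac{l^2}{\lambda^2}\,\tau_{IT}\right) . \tag{53}$$

Again, it is interesting to consider the optimal value of $\tau_1$,

$$\tau_1^{\mathrm{opt}} = \frac{(\tau_1^{\mathrm{opt}})_0}{1 - \dfrac{2\tau_3 - \tau_{IT}}{\lambda^2/(2D_{1\mathrm{eff}})}} , \tag{54}$$

where $(\tau_1^{\mathrm{opt}})_0 = \tau_3$ (see Eq. (6)) is the optimal value in the absence of ITs ($\lambda \to \infty$).

Interestingly, Eq. (54) shows that the optimal $\tau_1^{\mathrm{opt}}$ may be either smaller or larger
than $(\tau_1^{\mathrm{opt}})_0$, depending on the time of an IT, $\tau_{IT}$. It is also noteworthy that when $2\tau_3 >
\lambda^2/(2D_{1\mathrm{eff}}) + \tau_{IT}$ the optimal $\tau_1$ value becomes infinite. Namely, jumping makes the search
process slower. This is similar to the behavior found in Sec. V, and again the critical value
of $\tau_3$ depends on microscopic quantities such as the time of an IT.

The minimal search time obtained is

$$t^{\mathrm{search}}_{\mathrm{opt}} \sim \begin{cases} L\sqrt{\dfrac{2\tau_3}{D_{1\mathrm{eff}}}\left(1 + \dfrac{2D_{1\mathrm{eff}}(\tau_{IT} - \tau_3)}{\lambda^2}\right)}, & 2\tau_3 < \lambda^2/(2D_{1\mathrm{eff}}) + \tau_{IT} \\[2ex] \dfrac{L}{\lambda}\left(\dfrac{\lambda^2}{2D_{1\mathrm{eff}}} + \tau_{IT}\right), & 2\tau_3 > \lambda^2/(2D_{1\mathrm{eff}}) + \tau_{IT} \end{cases} \tag{55}$$

We stress again that it is clearly seen that jumping may slow the search considerably. Note
that again the optimal value of $\tau_1$ is very different from the canonical one discussed in Sec.
II. Fig. 13 shows a comparison between the theoretically predicted search time (Eq. (53))
and numerical simulation.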



VII. APPLICATION TO THE LAC REPRESSOR 



The above results cover a very wide variety of regimes. For a given protein only several
are of interest. To illustrate the use of the results presented above we consider the Lac repressor.
The Lac repressor is the most studied DNA-binding protein (see 3J] for a review), and its
structure is highly suggestive of intersegmental transfers taking place. Despite this, several
physical parameters of the protein are still unknown. In this section we use the known
parameters, $R \sim 10$ nm [35], $\Lambda \sim 1\,\mu$m and $L \sim 1$ mm, and those measured for the Lac repressor
with only one DNA-binding domain, $\tau_1 \sim 1$ ms, $\tau_3 \sim 0.1\tau_1$ and $D_1 \sim 0.05\,\mu\mathrm{m}^2/\mathrm{s}$ [19, 20]. Still
unknown are $b$, the sliding length, and $\tau_{IT}$, which we use as free parameters and study the
search time as these are varied. It is interesting to note that the Lac repressor is so large that, as
we show, essentially all ITs can move the protein at each step to a completely uncorrelated
location on the DNA.

FIG. 13: The influence of ITs on the search time is shown. The search time, $t^{\mathrm{search}}$, is plotted vs.
the antenna length, $l$, for different values of $\tau_3$ (10, 100, 1000, 10000, 100000 in units of $\delta$ from
bottom up). Here ITs, jumping and sliding are allowed. Thin solid lines with dots represent the
numerical results. The bold solid lines represent analytic results (Eq. (53)). The black dashed
lines represent the search time in the case with no ITs, obtained by using Eq. (5) with an effective diffusion constant $D_{1\mathrm{eff}}$.
Here $L$, $R$ and $b$ were taken to be 1224000, 1 and 20 lattice constants respectively.

Fig. 14 shows the predicted $t^{\mathrm{search}}$ from Secs. V and VI as a function of $b$ and $\tau_{IT}$. One
may see that for $b \gg R$, ITs do not affect the search time significantly even if $\tau_{IT}$ is small.
This results from the small probability of performing an UIT for large values of $b$. However,
if $b \ll R$ the search time may be decreased significantly by including ITs. For
example, by setting $b$ to be the size of one base pair, $\sim 0.3$ nm, the search time decreases by a
factor of three when $\tau_{IT} = \tau_3$, and if $\tau_{IT}$ = ^ the search time decreases by a factor of ten.
Finally, Fig. 13 shows that for large values of $\tau_{IT}$, ITs may slow down the search process.

FIG. 14: The analytical prediction of $t^{\mathrm{search}}$ is shown as a function of the unknown
parameters $b$ and $\tau_{IT}$.

VIII. SUMMARY 

In this article we presented a comprehensive study of the influence of ITs on the search
process. Using simple scaling arguments we studied a model which includes the protein
dynamics and DNA conformation. Two extreme regimes for the DNA dynamics were studied:
completely quenched (frozen) and annealed (rapidly moving) DNA. ITs were assumed to
relocate the protein to a randomly chosen DNA position within a range of the order of the
protein size. The essence of the description may be understood from Sec. III. The following
sections elaborate on it and study search processes based on ITs with sliding and/or jumping.
The results for a particular protein of interest may be obtained by selecting the
section most relevant for that case.

The obtained results clearly indicate that including IT in the search process may increase
the robustness of the search efficiency to different parameters of the model, such as the
protein-DNA affinity, the three-dimensional diffusion coefficient, etc.

The mechanism of IT may produce a significant increase of the optimal residence time of
the protein on the DNA between two subsequent rounds of three-dimensional diffusion, relative to
the value predicted by models that do not include IT. Recent experiments indicate that
the value of the residence time of the proteins on the DNA between two subsequent rounds
of three-dimensional diffusion is much larger than the optimum predicted by such models.
It is possible that the existence of the IT mechanism may explain the rather quick search
times found in in vivo experiments.

One of the most surprising results is that above some critical value of the typical
time of a jump the protein has no reason to detach from the DNA: it is more efficient for it
to stay bound to the DNA. The value of the critical jump time depends on the time of an
IT.

A key ingredient needed for the behavior to occur is the confinement of the DNA in 
a volume much smaller than its radius of gyration. The probability to perform an UIT 
obviously depends on the DNA density. Larger density implies a larger probability for 
UITs. Therefore the effects of IT are expected to be more important in systems with
high DNA density, such as cells or eukaryotic nuclei, than in in vitro experiments.

The dependency, mentioned above, on the DNA density leads to many possible regimes 
which depend on the cell size, DNA length etc. In particular, we found non-trivial regimes 
when the search time increases as a square root of the DNA length or is completely in- 
dependent of it. Our estimates indicate that these seem to be the ones most relevant to 
experiments. 

Our results also show that the search on quenched and annealed DNA may have quite 
different scaling behavior. In general a search that uses ITs is shown to be more rapid on 
an annealed DNA than on a quenched DNA. This happens due to the rapid decrease in 
correlations which results from the motion of the DNA molecule. 

Similar scaling arguments were used to discuss the effects of IT in [11]. However, there
the main mechanism that drives the IT was assumed to be the motion of the DNA molecule.
In our study ITs are shown to be important even on completely quenched DNA.
APPENDIX A:

In this appendix we argue that the typical time that the protein spends in a jump is
given by $\tau_3 \sim \Lambda^3/(D_3 L)$. This quantity is controlled by the average volume which is free from DNA.
Consider, first, the probability to find a volume of radius $s$ that is free from DNA. To do so we
describe the packed DNA as an ideal gas of straight rods of length $L_0$ that are distributed
randomly in the cell (see Fig. 3). The probability $p_{seg}$ that a given rod crosses a volume
of radius $s$ is of order $\frac{L_0^3}{\Lambda^3}\,\frac{s^2}{L_0^2} = \frac{L_0 s^2}{\Lambda^3}$. Here $L_0^3/\Lambda^3$ is the probability that a given segment is
located within a distance $L_0$ of a point inside the cell and $s^2/L_0^2$ is the probability that this
segment crosses a sphere of radius $s$ around the point. The probability that at least one
segment crosses the void is

$$1 - \left(1 - p_{seg}\right)^{L/L_0} \simeq 1 - e^{-L s^2/\Lambda^3} . \tag{A1}$$

Therefore the typical free volume radius is $\sim \sqrt{\Lambda^3/L}$. Hence, the typical time to explore^ this
volume is $\sim \Lambda^3/(D_3 L)$. A second way to get the same expression for $\tau_3$ is based on a comparison
between Eqs. (1) and (5). Obviously, in the limiting case $\tau_1 \ll \tau_3$ and $\sqrt{2D_1\tau_1} = r$, the
search becomes based only on three-dimensional diffusion. Hence, in this case formula
(5) should give (1). It is easy to see that this happens only when $\tau_3 \sim \Lambda^3/(D_3 L)$.
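The rod-gas estimate is easy to check numerically. The sketch below is only an illustration of the argument, with hypothetical function names: it compares the probability that at least one of $L/L_0$ independent rods crosses a void of radius $s$ with its exponential approximation, and evaluates the resulting jump time $\Lambda^3/(D_3 L)$. The rod length $L_0$ and all numerical values used with it are free inputs, not values taken from the text.

```python
import math


def crossing_probability(s, L, L0, Lam):
    """Probability that at least one of L/L0 rods crosses a void of radius s."""
    p_seg = L0 * s**2 / Lam**3               # single-rod crossing probability
    return 1.0 - (1.0 - p_seg) ** (L / L0)


def crossing_probability_approx(s, L, Lam):
    """Exponential approximation, valid for p_seg << 1 and L/L0 >> 1."""
    return 1.0 - math.exp(-L * s**2 / Lam**3)


def jump_time(L, Lam, d3):
    """Time to explore the typical DNA-free void by 3D diffusion."""
    s_typ = math.sqrt(Lam**3 / L)            # typical radius of a DNA-free void
    return s_typ**2 / d3
```

For instance, `crossing_probability(30, 1e6, 100, 1000)` and `crossing_probability_approx(30, 1e6, 1000)` agree closely, and the approximation does not depend on the rod length at all, which is why $L_0$ drops out of $\tau_3$.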



APPENDIX B: 



In this appendix we describe the details of the numerical simulation. The simulations
were done on a cubic lattice containing 800 x 800 x 800 sites. Assuming that a real cell has
a volume of $1\,\mu\mathrm{m}^3$, each site on the lattice represents a volume of $(dx)^3 = (1\,\mu\mathrm{m}/800)^3$. Polymers
(representing the DNA) with different lengths were embedded in the lattice by using a
self-avoiding random walk. The persistence length was accounted for by assigning a probability
$p_0$ of changing direction randomly among the possible directions. Using the persistence
length of about 50 nm leads to $p_0 = dx/(50\,\mathrm{nm}) = 0.025$. If during the process of generating
the configuration the polymer cannot be extended, we shrink the polymer by O(10)
lattice constants and regenerate. While this leads to a bias in the configuration for single
realizations confined in a box, which are of interest, we expect no effect on the results (a
non-biased algorithm is not feasible within our computational resources). The search process is
simulated following the model described in the text. In each step the protein has a probability
$dt/\delta$ to perform an IT and a probability $dt/\tau_1$ to perform a jump, where $dt$ is the time step. ITs were simulated by
randomly choosing a DNA site within a distance $R$ from the location of the protein. With
the exception of Sec. II, where a complete simulation of the three-dimensional diffusion
was carried out by performing moves to the 6 available directions, a jump was simulated by
randomly choosing a site on the DNA. The time of the jump was taken as a free constant
($\tau_3$).

^ In three-dimensional space diffusive exploration is not compact, i.e. the probability to find a finite
target (sphere) is less than one. However, the DNA as a target may be described as a set of straight rods.
Hence, the search process effectively looks like a two-dimensional search for a finite target (disk), i.e. it is
compact (up to logarithmic corrections).
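The configuration-generation step can be sketched as follows. This is a small illustrative reimplementation under stated assumptions, not the code used for the paper: the function names are hypothetical, the walk starts at the box center, a turn is forced whenever the current direction is blocked, and after a dead end the last `shrink` sites are removed before growth resumes. The lattice parameters follow the numbers quoted above.

```python
import random

# The six nearest-neighbor directions on a cubic lattice.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def lattice_parameters(cell_side_nm=1000.0, sites_per_side=800, persistence_nm=50.0):
    """Return the lattice constant dx (nm) and the turning probability p0."""
    dx = cell_side_nm / sites_per_side
    return dx, dx / persistence_nm


def generate_polymer(n_sites, box, p0, rng, shrink=10, max_moves=1_000_000):
    """Grow a self-avoiding, persistent walk of n_sites sites in a box**3 lattice."""
    start = (box // 2, box // 2, box // 2)
    path, occupied = [start], {start}
    direction = rng.choice(DIRECTIONS)
    moves = 0
    while len(path) < n_sites:
        moves += 1
        if moves > max_moves:
            raise RuntimeError("polymer generation did not converge")
        x, y, z = path[-1]
        free = []
        for d in DIRECTIONS:
            site = (x + d[0], y + d[1], z + d[2])
            if all(0 <= c < box for c in site) and site not in occupied:
                free.append(d)
        if not free:
            # Dead end: remove the last `shrink` sites and grow again from there.
            for _ in range(min(shrink, len(path) - 1)):
                occupied.discard(path.pop())
            continue
        # Turn with probability p0, or whenever the current direction is blocked.
        if direction not in free or rng.random() < p0:
            direction = rng.choice(free)
        site = (x + direction[0], y + direction[1], z + direction[2])
        path.append(site)
        occupied.add(site)
    return path
```

A call such as `generate_polymer(500, 30, lattice_parameters()[1], random.Random(0))` returns a list of 500 distinct lattice sites in which consecutive sites are nearest neighbors; the full 800 x 800 x 800 lattice works the same way but needs far more memory and time.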

APPENDIX C: 

In this appendix we argue that using ITs the protein can only move along the chemical
coordinate to distances smaller than $R$ or larger than $\Lambda^2/L_0$. As mentioned above, we assume
that during an IT the protein chooses a new location whose three-dimensional distance from
its current location is smaller than $R$. The new location is chosen randomly with a uniform
probability. Given the uniform probability we need to estimate the total typical length
available at each IT, $G$. We separate this quantity into four types of contributions:

$$G = G_1 + G_2 + G_3 + G_4 . \tag{C1}$$

The first, $G_1$, is the contribution from DNA whose distance along the chemical coordinate
from a point $x$ is smaller than $R$, the protein size. This is given by

$$G_1(x) \sim 2R . \tag{C2}$$

The contribution G2 arises from DNA whose chemical distance from a point x is larger than 
R but smaller than Lq. The probability for the DNA to bend on a scale / is approximately 
given by ^— ^xir~~^' However, the probability that this bend will connect to x is ~ (due 
to the area ratio). Since each connection contributes a length of the order of R to G2 we 
obtain 

R 
The contribution $G_3$ comes from DNA whose chemical distance from $x$ is larger than $L_0$ 
but smaller than the length at which the DNA feels the boundaries of the cell, which for a 
random-walk chain of persistence length $L_0$ in a cell of linear size $\Lambda$ is $\sim \Lambda^2/L_0$. 
This contribution can be overestimated using the fact that a free three-dimensional random walk 
on a lattice visits the origin about 1.5 times on average (counting the starting point). Therefore, a continuous free 
three-dimensional random walk with persistence length Lq returns to a region with radius 
R an order of ^x^^ times. Each such return contributes length of about R to G3, leading 
to 

G3 ~ ^ . (C4) 
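
The figure of about 1.5 quoted above can be checked numerically. For a simple random walk on the cubic lattice, the expected number of visits to the origin (counting the starting point) is the lattice Green's function at the origin, $u = (2\pi)^{-3} \int d^3k \, [1 - (\cos k_x + \cos k_y + \cos k_z)/3]^{-1} \approx 1.516$. The sketch below evaluates this integral with a plain midpoint rule. The function name and the grid size are illustrative choices, and at this resolution the result is only accurate to a few percent.

```python
import math

def expected_visits_to_origin(n=40):
    """Midpoint-rule estimate of the expected number of visits to the origin
    (including the start) for a simple random walk on the cubic lattice."""
    h = math.pi / n
    # By symmetry it is enough to average the integrand over [0, pi]^3.
    cosines = [math.cos((i + 0.5) * h) for i in range(n)]
    total = 0.0
    for cx in cosines:
        for cy in cosines:
            for cz in cosines:
                # Midpoints never hit k = 0, so the denominator stays positive.
                total += 1.0 / (1.0 - (cx + cy + cz) / 3.0)
    return total / n ** 3
```

Since the argument in this appendix only needs an order-of-magnitude bound on the number of returns, the precise value of this constant does not affect the scaling.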

Finally, $G_4$ is the contribution from the rest of the DNA (whose chemical distance from $x$ is larger 
than the length at which the DNA feels the boundaries of the cell but smaller than $L$). Using (13) and since each connected segment contributes a 
length of the order of $R$ to $G_4$ one obtains 

L LR^ 

Ga ~ R—Pseg ~ -r§- • (C5) 

This result can be understood within a mean field approach: if the DNA has a total length 
L and is assumed to be distributed uniformly in the cell, every volume in the cell contains a 
part of the total DNA length that is equal to the total DNA length times the fraction of the 
volume. One can see that in the assumed regime where Lq ^ R and L ^ Aj^, G2 and G3 
are much smaller than $G_4$ and $G_1$. Therefore, we can safely neglect the probability that the 
protein will move to a location on the DNA whose chemical distance from the protein's current 
location is larger than $R$ and smaller than the length at which the DNA feels the boundaries of the cell. 
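
The mean-field statement above lends itself to a quick numerical check. The sketch below assumes a cubic cell of side `Lam` and a spherical reach of radius `R` around the protein; both function names are hypothetical. It compares the mean-field length, $L$ times the volume fraction, with a Monte Carlo estimate in which the DNA is cut into equal pieces scattered uniformly in the cell. The scaling form $L R^3/\Lambda^3$ differs from the first function only by the geometric factor $4\pi/3$.

```python
import math
import random

def mean_field_length(L, R, Lam):
    """Mean-field DNA length inside a sphere of radius R for DNA of total
    length L spread uniformly over a cubic cell of side Lam."""
    return L * (4.0 * math.pi / 3.0) * R ** 3 / Lam ** 3

def sampled_length(L, R, Lam, n=200_000, seed=0):
    """Monte Carlo version: split the DNA into n equal pieces placed uniformly
    at random in the cell and add up the pieces within R of the cell centre."""
    rng = random.Random(seed)
    c = Lam / 2.0
    inside = 0
    for _ in range(n):
        dx, dy, dz = (rng.random() * Lam - c for _ in range(3))
        if dx * dx + dy * dy + dz * dz <= R * R:
            inside += 1
    return L * inside / n
```

The uniform scattering ignores chain connectivity, which is exactly the mean-field assumption; it is meant only for the part of the DNA that is far from $x$ along the chemical coordinate.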

APPENDIX D: 

In this appendix the effective times $\tau_{1\mathrm{eff}}$ and $\tau_{3\mathrm{eff}}$ are calculated. 



1. The effective time of a correlated movement 

We have two independent mechanisms for an uncorrelated motion. The first is jumping 
with a typical time of $\tau_1$ between two subsequent jumps. This process has Poissonian 
statistics and therefore the probability that the protein does not perform a jump before 
time $t$ is 

$P_J = \exp\left(-\frac{t}{\tau_1}\right)$ . (D1) 
The second mechanism for uncorrelated motion is an UIT with a typical time of order of 
$\lambda^2/D_{1\mathrm{eff}}$ between two subsequent UITs. In the case of annealed DNA this mechanism has 
Poissonian statistics and the probability that the protein does not perform an UIT before 
time $t$ is 

$P_{IT} \simeq \exp\left(-\frac{D_{1\mathrm{eff}}\, t}{\lambda^2}\right)$ . 

For quenched DNA the probability that the protein has not performed an UIT after traveling 
a distance $x$ is $\sim e^{-x/\lambda}$. Since the protein performs an effective one-dimensional diffusion, 
$x \sim \sqrt{2 D_{1\mathrm{eff}}\, t}$ and we obtain 

$P_{IT} \sim \exp\left(-\frac{\sqrt{2 D_{1\mathrm{eff}}\, t}}{\lambda}\right)$ . 



We will take the typical time of a non-interrupted (by an uncorrelated relocation) 
one-dimensional effective diffusion to be 

$\tau_{1\mathrm{eff}} = \int_0^\infty P_J P_{IT}\, dt \simeq \frac{1}{\frac{1}{\tau_1} + \frac{D_{1\mathrm{eff}}}{\lambda^2}}$ . (D3) 

The last expression is exact in the annealed case but it is only an approximation in the 
quenched regime. One can verify that the error does not exceed 50%, which is sufficient for 
scaling arguments of the type used in the paper. 
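
The size of this error can be checked directly. The sketch below assumes the quenched survival probability $\exp(-\sqrt{2 D_{1\mathrm{eff}} t}/\lambda)$ implied by the text, with `lam` standing for the decay length of the no-UIT probability along the DNA (the symbol is an assumption here), and measures the error as the ratio of the annealed closed form to the quenched integral minus one. The quenched integral is done analytically: substituting $t = \tau_1 u^2$ gives $\tau_1 \left[1 - b \frac{\sqrt{\pi}}{2} e^{b^2/4}\, \mathrm{erfc}(b/2)\right]$ with $b = \sqrt{2 D_{1\mathrm{eff}} \tau_1}/\lambda$. Function names are hypothetical.

```python
import math

def tau_annealed(tau1, D1eff, lam):
    """Closed form 1 / (1/tau1 + D1eff/lam**2), exact for annealed DNA."""
    return 1.0 / (1.0 / tau1 + D1eff / lam ** 2)

def tau_quenched(tau1, D1eff, lam):
    """Integral of exp(-t/tau1) * exp(-sqrt(2*D1eff*t)/lam) over t >= 0,
    evaluated in closed form with the complementary error function."""
    b = math.sqrt(2.0 * D1eff * tau1) / lam
    return tau1 * (1.0 - b * math.sqrt(math.pi) / 2.0
                   * math.exp(b * b / 4.0) * math.erfc(b / 2.0))

def worst_relative_error(n=400):
    """Largest value of tau_annealed/tau_quenched - 1 over a log-spaced scan
    of tau1*D1eff/lam**2 between 1e-3 and 1e3 (with tau1 = lam = 1)."""
    worst = 0.0
    for i in range(n + 1):
        a = 10.0 ** (-3.0 + 6.0 * i / n)
        ratio = tau_annealed(1.0, a, 1.0) / tau_quenched(1.0, a, 1.0)
        worst = max(worst, ratio - 1.0)
    return worst
```

Under these assumptions the two expressions agree in both limits, $\tau_1 D_{1\mathrm{eff}}/\lambda^2 \to 0$ and $\to \infty$, and the scan stays below 0.5 in between, consistent with the bound stated above.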

2. The effective time of an uncorrelated movement 

Since there are two mechanisms for uncorrelated movement, a jump with a typical time 
$\tau_3$ and an UIT with a typical time $\delta$, the typical time of the uncorrelated movement is the 
average of $\tau_3$ and $\delta$ weighted by the relevant probabilities for each process: 

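$\tau_{3\mathrm{eff}} = \tau_3 \int_0^\infty \frac{1}{\tau_1} P_J P_{IT}\, dt + \delta \left(1 - \int_0^\infty \frac{1}{\tau_1} P_J P_{IT}\, dt\right) = \tau_3 \frac{\tau_{1\mathrm{eff}}}{\tau_1} + \delta \left(1 - \frac{\tau_{1\mathrm{eff}}}{\tau_1}\right) = \frac{\tau_3 + \delta\, \tau_1 D_{1\mathrm{eff}}/\lambda^2}{1 + \tau_1 D_{1\mathrm{eff}}/\lambda^2}$ . 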
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In the case of sliding 5 is replaced by ttt- 
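
The weighted average described in this subsection can be illustrated with two competing Poisson processes. The sketch below is restricted to the annealed case and assumes that jumps occur at rate $1/\tau_1$ and UITs at rate $D_{1\mathrm{eff}}/\lambda^2$ (with `lam` an assumed symbol for the UIT length scale); the weight of each duration is then the probability that the corresponding process fires first. Both function names are hypothetical.

```python
import random

def tau3_eff(tau1, tau3, delta, D1eff, lam):
    """Average duration of an uncorrelated move when jumps (rate 1/tau1,
    duration tau3) compete with UITs (rate D1eff/lam**2, duration delta)."""
    rate_jump = 1.0 / tau1
    rate_uit = D1eff / lam ** 2
    p_jump = rate_jump / (rate_jump + rate_uit)  # chance the jump happens first
    return p_jump * tau3 + (1.0 - p_jump) * delta

def simulate_tau3_eff(tau1, tau3, delta, D1eff, lam, n=100_000, seed=1):
    """Monte Carlo check: draw both exponential waiting times and record the
    duration of whichever relocation occurs first (requires D1eff > 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        t_jump = rng.expovariate(1.0 / tau1)
        t_uit = rng.expovariate(D1eff / lam ** 2)
        total += tau3 if t_jump < t_uit else delta
    return total / n
```

When UITs are switched off (`D1eff = 0`) the closed form reduces to `tau3`, and when they dominate it approaches `delta`, which is the expected behaviour of a probability-weighted average.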